KMID : 1022420180100010033
Phonetics and Speech Sciences
2018, Volume 10, No. 1, pp. 33-38
Multi-resolution DenseNet based acoustic models for reverberant speech recognition
Park Sun-Chan, Jeong Yong-Won, Kim Hyung-Soon
Abstract
Although deep neural network-based acoustic models have greatly improved the performance of automatic speech recognition (ASR), reverberation still degrades the performance of distant speech recognition in indoor environments. In this paper, we adopt DenseNet, which has shown excellent performance in image classification tasks, to improve the performance of reverberant speech recognition. DenseNet enables deep convolutional neural networks (CNNs) to be trained effectively by concatenating the feature maps of each convolutional layer. In addition, we extend the concept of the multi-resolution CNN to a multi-resolution DenseNet for robust speech recognition in reverberant environments. We evaluate the performance of reverberant speech recognition on the single-channel ASR task of the REverberant Voice Enhancement and Recognition Benchmark (REVERB) challenge 2014. According to the experimental results, the DenseNet-based acoustic models outperform the conventional CNN-based ones, and the multi-resolution DenseNet provides additional performance improvement.
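
To illustrate the two ideas named in the abstract (dense connectivity by feature-map concatenation, and parallel branches at different input resolutions), the following is a minimal PyTorch-style sketch. The growth rate, kernel sizes, pooling factor, and the way the two branches are merged before the output layer are illustrative assumptions for exposition, not the authors' exact architecture or hyperparameters.

# Minimal sketch of a dense block and a two-branch multi-resolution front end.
# Layer sizes and the merging scheme are assumptions, not the paper's setup.
import torch
import torch.nn as nn


class DenseBlock(nn.Module):
    """Each layer receives the concatenation of all preceding feature maps."""

    def __init__(self, in_channels: int, growth_rate: int, num_layers: int):
        super().__init__()
        self.layers = nn.ModuleList()
        channels = in_channels
        for _ in range(num_layers):
            self.layers.append(
                nn.Sequential(
                    nn.BatchNorm2d(channels),
                    nn.ReLU(inplace=True),
                    nn.Conv2d(channels, growth_rate, kernel_size=3, padding=1),
                )
            )
            channels += growth_rate  # concatenation grows the channel count

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = [x]
        for layer in self.layers:
            out = layer(torch.cat(features, dim=1))
            features.append(out)
        return torch.cat(features, dim=1)


class MultiResolutionDenseNet(nn.Module):
    """Two parallel dense branches over different input resolutions,
    merged before the classifier (an assumed merging scheme)."""

    def __init__(self, num_classes: int, growth_rate: int = 12, num_layers: int = 4):
        super().__init__()
        # Fine-resolution branch: full time-frequency input.
        self.fine = DenseBlock(1, growth_rate, num_layers)
        # Coarse-resolution branch: input downsampled by average pooling.
        self.pool = nn.AvgPool2d(kernel_size=2)
        self.coarse = DenseBlock(1, growth_rate, num_layers)
        out_ch = 1 + growth_rate * num_layers
        self.classifier = nn.Linear(2 * out_ch, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, freq, time) log-mel filterbank features.
        fine = self.fine(x).mean(dim=(2, 3))                # global average pool
        coarse = self.coarse(self.pool(x)).mean(dim=(2, 3))
        return self.classifier(torch.cat([fine, coarse], dim=1))


if __name__ == "__main__":
    model = MultiResolutionDenseNet(num_classes=2000)
    dummy = torch.randn(8, 1, 40, 11)  # batch of 40-band, 11-frame contexts
    print(model(dummy).shape)          # torch.Size([8, 2000])

In this sketch the coarse branch simply sees an average-pooled copy of the same features; the point is only that each branch keeps the DenseNet concatenation pattern while operating at a different time-frequency resolution, and their outputs are concatenated before the senone classifier.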
KEYWORD
convolutional neural network, DenseNet, multi-resolution, speech recognition
Listed journal information
Korea Research Foundation (KCI)